Coalesced pipelined SMB I/O for higher 10G throughput (v0.5.3) by lukekim · Pull Request #22 · spiceai/spiceio

lukekim · 2026-05-14T08:25:52Z

Summary

Coalesced pipelined SMB I/O: pipelined_write and pipelined_read now build the entire batch into one BytesMut, sign each packet in place, and emit a single write_all per batch. This eliminates 64 per-packet to_vec allocations and collapses 64 write_all syscalls into 1. Encode-side CPU time drops from 154 µs to 49 µs at the typical d64×64 KiB working point (pipeline depth 64, 64 KiB chunks; 3.1× faster), on top of the syscall reduction.
Zero-copy read decode: the new decode_read_response_from_msg takes the owned response Vec and returns a Bytes slice over it. This eliminates the per-response body.to_vec(), saving ~4 MiB of memcpy per 64-deep batch at 64 KiB chunks.
GetObject streaming channel sized to the SMB pipeline depth: the capacity was 4 and is now READ_PIPELINE_DEPTH. A full pipeline batch can be queued into the channel without blocking, so back-to-back SMB read batches overlap with HTTP draining instead of serializing per chunk.
Bench script upgrades (scripts/bench-live.sh): adds concurrent multi-stream PUT/GET (BENCH_CONCURRENCY, default 8) — the test that actually exercises a 10G pipe — and an optional raw mount_smbfs baseline (BENCH_MOUNT_BASELINE=1) to quantify the spiceio translation overhead against the link ceiling.
New micro-benches: pipelined_write_encode_coalesced and pipelined_read_decode_zerocopy track the optimized paths.
Bumps version to v0.5.3.
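
As a rough illustration of the coalescing idea in the first item, the sketch below encodes a batch into one contiguous buffer, patches each packet's signature slot in place, and issues one write_all for the whole batch. It is not the code from this PR: it uses only the standard library, with Vec<u8> standing in for BytesMut, a toy XOR checksum standing in for SMB2 signing, and a made-up 8-byte header. Every name in it (encode_batch_coalesced, toy_signature, HEADER_LEN) is hypothetical.

```rust
use std::io::Write;

// Hypothetical framing: 4-byte little-endian payload length + 4-byte signature slot.
const HEADER_LEN: usize = 8;

/// Toy "signature": XOR-folds the payload into 4 lanes. NOT real SMB2 signing.
fn toy_signature(payload: &[u8]) -> [u8; 4] {
    let mut sig = [0u8; 4];
    for (i, b) in payload.iter().enumerate() {
        sig[i % 4] ^= *b;
    }
    sig
}

/// Encodes every chunk into one contiguous buffer and signs each packet in place.
/// Returns the buffer and the byte offset at which each packet starts.
fn encode_batch_coalesced(chunks: &[&[u8]]) -> (Vec<u8>, Vec<usize>) {
    let total: usize = chunks.iter().map(|c| HEADER_LEN + c.len()).sum();
    let mut batch = Vec::with_capacity(total); // one allocation for the whole batch
    let mut offsets = Vec::with_capacity(chunks.len());
    for chunk in chunks {
        let start = batch.len();
        offsets.push(start);
        batch.extend_from_slice(&(chunk.len() as u32).to_le_bytes());
        batch.extend_from_slice(&[0u8; 4]); // signature slot, patched below
        batch.extend_from_slice(chunk);
        // Sign in place: read the payload where it already sits in the batch
        // buffer, then overwrite the slot. No per-packet copy is made.
        let sig = toy_signature(&batch[start + HEADER_LEN..]);
        batch[start + 4..start + HEADER_LEN].copy_from_slice(&sig);
    }
    (batch, offsets)
}

fn main() -> std::io::Result<()> {
    let chunks: [&[u8]; 2] = [&b"hello"[..], &b"coalesced!"[..]];
    let (batch, offsets) = encode_batch_coalesced(&chunks);

    // A Vec<u8> stands in for the socket; the point is the single write call.
    let mut sink: Vec<u8> = Vec::new();
    sink.write_all(&batch)?;

    println!("packets start at {:?}, {} bytes written in one call", offsets, sink.len());
    Ok(())
}
```

The per-packet alternative would allocate a buffer for each packet and call write_all once per packet, which is the pattern the summary describes replacing.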

Test plan

make lint — fmt, clippy (strict), rustdoc warnings all clean
cargo test --locked — 145/145 unit tests pass (3 new tests cover the zero-copy decoder, including overflow rejection)
cargo bench --bench protocol_bench -- pipelined — confirms the 3.1× encode speedup at d64×64 KiB and ~2.1× at d64×1 MiB; no regression on existing benches
On a 10G-attached NAS: BENCH_CONCURRENCY=16 BENCH_MOUNT_BASELINE=1 ./scripts/bench-live.sh — verify the concurrent PUT/GET aggregate approaches the mount_smbfs ceiling and the single-stream numbers are no worse than before
CI sccache + extended + stress integration tests pass against the runner NAS
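
For readers unfamiliar with what a concurrent aggregate measures, here is a minimal sketch of the arithmetic: run several streams in parallel, sum the bytes they moved, and divide by the wall-clock time of the whole run. It does not reproduce scripts/bench-live.sh; the transfer is an in-memory copy rather than a PUT or GET, and run_concurrent, fake_transfer, and throughput_mib_s are hypothetical names. The only detail taken from the description is the BENCH_CONCURRENCY variable and its default of 8.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Runs `streams` workers concurrently and returns (total bytes moved, wall time).
fn run_concurrent<F>(streams: usize, transfer: F) -> (u64, Duration)
where
    F: Fn(usize) -> u64 + Send + Copy + 'static,
{
    let start = Instant::now();
    let handles: Vec<_> = (0..streams)
        .map(|i| thread::spawn(move || transfer(i)))
        .collect();
    let total: u64 = handles
        .into_iter()
        .map(|h| h.join().expect("worker panicked"))
        .sum();
    // Wall time covers the slowest stream, so this yields aggregate throughput.
    (total, start.elapsed())
}

/// Placeholder transfer: copies a 64 KiB chunk 16 times in memory (1 MiB total).
/// A real benchmark would issue a PUT or GET here instead.
fn fake_transfer(_stream: usize) -> u64 {
    let chunk = vec![0xABu8; 64 * 1024];
    let mut sink = Vec::with_capacity(chunk.len());
    let mut moved = 0u64;
    for _ in 0..16 {
        sink.clear();
        sink.extend_from_slice(&chunk);
        moved += sink.len() as u64;
    }
    moved
}

/// Converts a byte count and a duration into MiB/s.
fn throughput_mib_s(bytes: u64, elapsed: Duration) -> f64 {
    (bytes as f64 / (1024.0 * 1024.0)) / elapsed.as_secs_f64()
}

fn main() {
    // Mirrors the script's knob: BENCH_CONCURRENCY, defaulting to 8.
    let streams: usize = std::env::var("BENCH_CONCURRENCY")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(8);
    let (total, elapsed) = run_concurrent(streams, fake_transfer);
    println!(
        "{} streams moved {} bytes, aggregate {:.1} MiB/s",
        streams,
        total,
        throughput_mib_s(total, elapsed)
    );
}
```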

Reworks the SMB pipelined-read and pipelined-write paths to build all batch packets into one contiguous BytesMut, sign each in-place, and emit a single write_all per batch — eliminating 64 per-packet to_vec allocations and collapsing 64 write_all syscalls per batch into 1. Adds a zero-copy read response decoder that slices an owned Vec into Bytes without the prior body.to_vec() — saves ~4 MiB of memcpy per 64-deep batch at 64 KiB chunks. Sizes the GetObject streaming channel to READ_PIPELINE_DEPTH so a full pipeline batch can dump into the channel without blocking, letting back-to-back SMB batches overlap HTTP draining. Extends bench-live.sh with concurrent multi-stream PUT/GET (BENCH_CONCURRENCY) and an optional raw mount_smbfs baseline (BENCH_MOUNT_BASELINE) to quantify the spiceio translation overhead against the link ceiling. Adds matching protocol micro-benches. Microbench (pipelined_write_encode, d64 x 64 KiB): 154 us -> 49 us, ~3.1x faster on the CPU side, on top of the 64 -> 1 syscall reduction.
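
The channel-sizing point can be seen with a small experiment: a bounded channel of capacity 4 accepts only 4 chunks of a batch before the producer would have to wait, while a channel sized to the batch depth accepts the whole batch. The sketch below uses std::sync::mpsc::sync_channel as a stand-in for whatever channel type the router uses, and assumes READ_PIPELINE_DEPTH is 64 (the description only says batches are 64 deep). The helper name is hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

// Assumed value for illustration; the real constant lives in src/smb/ops.rs.
const READ_PIPELINE_DEPTH: usize = 64;

/// Counts how many chunks of one batch can be queued before the producer
/// would block, with no consumer draining the channel in the meantime.
fn chunks_accepted_without_blocking(capacity: usize, batch: usize) -> usize {
    // Keep the receiver alive so sends fail with Full, not Disconnected.
    let (tx, _rx) = sync_channel::<Vec<u8>>(capacity);
    let mut accepted = 0;
    for i in 0..batch {
        match tx.try_send(vec![i as u8; 4]) {
            Ok(()) => accepted += 1,
            Err(TrySendError::Full(_)) => break, // a blocking send would stall here
            Err(TrySendError::Disconnected(_)) => break,
        }
    }
    accepted
}

fn main() {
    let old = chunks_accepted_without_blocking(4, READ_PIPELINE_DEPTH);
    let new = chunks_accepted_without_blocking(READ_PIPELINE_DEPTH, READ_PIPELINE_DEPTH);
    println!("capacity 4 queued {} chunks; capacity {} queued {}", old, READ_PIPELINE_DEPTH, new);
}
```

With the smaller capacity the SMB reader has to wait on the HTTP side after the first few chunks of every batch; with the larger one it can hand off a full batch and start the next read.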

Copilot

Pull request overview

This PR optimizes the SMB read/write pipelining hot paths to reduce per-packet allocations, memcpys, and syscalls, and updates the S3 GetObject streaming path and benchmarking tooling to better target 10G-throughput scenarios.

Changes:

Coalesce pipelined SMB read/write request batches into a single BytesMut and sign packets in-place before a single write_all.
Add a zero-copy read-response decoder that slices payload bytes directly from the owned SMB2 message buffer.
Size GetObject’s streaming channel to the SMB pipeline depth and enhance live/criterion benchmarks; bump version to 0.5.3.
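
To make the zero-copy decode concrete, the following sketch takes ownership of a response buffer and returns a view over the payload bytes rather than copying them out, rejecting offset and length fields that overflow or fall outside the message. It is an assumption-laden stand-in, not decode_read_response_from_msg itself: SharedSlice is a minimal std-only substitute for bytes::Bytes, and the 6-byte header layout (u16 payload offset, u32 payload length) is invented for the example rather than taken from the SMB2 READ response format.

```rust
use std::ops::Range;
use std::sync::Arc;

/// Minimal stand-in for bytes::Bytes: a shared buffer plus the range in view.
#[derive(Clone, Debug)]
struct SharedSlice {
    buf: Arc<Vec<u8>>,
    range: Range<usize>,
}

impl SharedSlice {
    fn as_slice(&self) -> &[u8] {
        &self.buf[self.range.clone()]
    }
}

// Hypothetical header: bytes 0..2 = payload offset (u16 LE), 2..6 = payload length (u32 LE).
const SKETCH_HEADER_LEN: usize = 6;

/// Builds a message in the hypothetical layout, for demonstration only.
fn build_msg(payload: &[u8]) -> Vec<u8> {
    let mut msg = Vec::with_capacity(SKETCH_HEADER_LEN + payload.len());
    msg.extend_from_slice(&(SKETCH_HEADER_LEN as u16).to_le_bytes());
    msg.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    msg.extend_from_slice(payload);
    msg
}

/// Takes the owned message and returns a view of the payload without copying it.
fn decode_read_payload_sketch(msg: Vec<u8>) -> Result<SharedSlice, &'static str> {
    if msg.len() < SKETCH_HEADER_LEN {
        return Err("message too short");
    }
    let offset = u16::from_le_bytes([msg[0], msg[1]]) as usize;
    let len = u32::from_le_bytes([msg[2], msg[3], msg[4], msg[5]]) as usize;
    // checked_add guards against wrap-around on targets with a narrow usize.
    let end = offset.checked_add(len).ok_or("offset + length overflows")?;
    if offset < SKETCH_HEADER_LEN || end > msg.len() {
        return Err("payload out of bounds");
    }
    // Moving the Vec into the Arc moves only its handle; the heap bytes stay put.
    Ok(SharedSlice { buf: Arc::new(msg), range: offset..end })
}

fn main() {
    let msg = build_msg(b"payload bytes");
    let view = decode_read_payload_sketch(msg).expect("well-formed message");
    println!("decoded {} payload bytes without copying", view.as_slice().len());

    // The saving quoted in the summary: 64 responses of 64 KiB each is 4 MiB.
    assert_eq!(64 * 64 * 1024, 4 * 1024 * 1024);
}
```

Because the view shares the original allocation, its lifetime is tied to the whole response buffer; that is the usual trade-off of slicing instead of copying.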

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

File	Description
src/smb/protocol.rs	Adds `decode_read_response_from_msg` and unit tests for zero-copy read payload extraction.
src/smb/ops.rs	Exposes `READ_PIPELINE_DEPTH` for cross-layer coordination (SMB ↔ HTTP streaming).
src/smb/client.rs	Implements coalesced pipelined read/write encoding + in-place signing; uses new zero-copy decoder in pipelined reads.
src/s3/router.rs	Sizes GetObject streaming channel to SMB pipeline depth to improve overlap between SMB reads and HTTP writes.
scripts/bench-live.sh	Adds concurrent PUT/GET benchmarks and optional `mount_smbfs` baseline mode.
benches/protocol_bench.rs	Adds micro-benches for coalesced pipelined write encoding and zero-copy pipelined read decode.
Cargo.toml	Version bump to `0.5.3`.
Cargo.lock	Lockfile version bump to `0.5.3`.

Copilot AI review requested due to automatic review settings May 14, 2026 08:25

Copilot started reviewing on behalf of lukekim May 14, 2026 08:26

Copilot AI reviewed May 14, 2026

Comment thread src/smb/protocol.rs

Comment thread src/s3/router.rs

Comment thread scripts/bench-live.sh

Comment thread scripts/bench-live.sh

lukekim self-assigned this May 14, 2026

lukekim added the enhancement New feature or request label May 14, 2026

lukekim enabled auto-merge (squash) May 14, 2026 08:34

lukekim merged commit 5736b5f into trunk May 14, 2026
8 checks passed

lukekim deleted the worktree-robust-beaming-pizza branch May 14, 2026 08:37

This was referenced May 14, 2026

Fix sharing-violation 500s + carry forward review fixes (v0.5.4) #23

Merged

Floor zero negotiated I/O sizes + tighten read decoder bounds (v0.5.5) #24

Open

lukekim merged 1 commit into trunk from worktree-robust-beaming-pizza
